Algebraic classifiers: a generic approach to fast cross-validation, online training, and parallel training

Author

  • Michael Izbicki
Abstract

We use abstract algebra to derive new algorithms for fast cross-validation, online learning, and parallel learning. To use these algorithms on a classification model, we must show that the model has appropriate algebraic structure. It is easy to give algebraic structure to some models, and we do this explicitly for Bayesian classifiers and a novel variation of decision stumps called HomStumps. But not all classifiers have an obvious structure, so we introduce the Free HomTrainer. This can be used to give a “generic” algebraic structure to any classifier. We use the Free HomTrainer to give algebraic structure to bagging and boosting. In so doing, we derive novel online and parallel algorithms, and present the first fast cross-validation schemes for these classifiers.
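To make the algebraic idea concrete, here is a minimal sketch, not the paper's implementation. It assumes the trained model forms a monoid (an associative `merge` with an empty model as identity) and that batch training is a monoid homomorphism, so training on a concatenated dataset gives the same model as merging separately trained models. The names `CountModel`, `train_one`, `train`, and `fast_cross_validate` are hypothetical, and the model is a deliberately simple count-based Bayesian classifier over one categorical feature.

```python
from collections import Counter
from dataclasses import dataclass, field
from functools import reduce


@dataclass
class CountModel:
    """Hypothetical model state: label counts and (label, feature) counts."""
    label_counts: Counter = field(default_factory=Counter)
    pair_counts: Counter = field(default_factory=Counter)

    def merge(self, other):
        # Monoid operation: adding counts is associative,
        # and the empty CountModel() is the identity element.
        return CountModel(self.label_counts + other.label_counts,
                          self.pair_counts + other.pair_counts)

    def predict(self, x):
        # argmax over labels of the joint count of (label, x);
        # ties are broken by label frequency, then by the label itself.
        return max(self.label_counts,
                   key=lambda y: (self.pair_counts[(y, x)],
                                  self.label_counts[y], y),
                   default=None)


def train_one(point):
    # Train on a single (feature, label) pair.
    x, y = point
    return CountModel(Counter({y: 1}), Counter({(y, x): 1}))


def train(data):
    # Batch training as a fold of merges over single-point models.
    # Because merge is associative, the same fold can be done online
    # (merge one point at a time) or in parallel (merge per-chunk models).
    return reduce(CountModel.merge, map(train_one, data), CountModel())


def fast_cross_validate(data, k):
    # Train one model per fold, then build each "all folds except i" model
    # from prefix and suffix merges instead of retraining from scratch.
    folds = [data[i::k] for i in range(k)]
    models = [train(fold) for fold in folds]

    prefix = [CountModel()]            # prefix[i] = models[0] .. models[i-1]
    for m in models:
        prefix.append(prefix[-1].merge(m))

    suffix = [CountModel()]            # built back to front, then reversed
    for m in reversed(models):
        suffix.append(m.merge(suffix[-1]))
    suffix.reverse()                   # suffix[i] = models[i] .. models[k-1]

    errors = 0
    for i, fold in enumerate(folds):
        held_out_model = prefix[i].merge(suffix[i + 1])
        errors += sum(held_out_model.predict(x) != y for x, y in fold)
    return errors / len(data)
```

Under these assumptions, each data point is trained on once and the k held-out models cost on the order of k merges in total, whereas naive k-fold cross-validation retrains on roughly (k-1)/k of the data for every fold. A call such as `fast_cross_validate(points, 3)`, with `points` a list of `(feature, label)` pairs, returns the fraction of held-out points that were misclassified.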

Similar resources

Online multiple people tracking-by-detection in crowded scenes

Multiple people detection and tracking is a challenging task in real-world crowded scenes. In this paper, we have presented an online multiple people tracking-by-detection approach with a single camera. We have detected objects with deformable part models and a visual background extractor. In the tracking phase we have used a combination of support vector machine (SVM) person-specific classifie...

Improving Automated Land Cover Mapping by Identifying and Eliminating Mislabeled Observations from Training Data

This paper presents a new approach to identifying and eliminating mislabeled training samples. The goal of this technique is to decrease the error of classification algorithms by improving the quality of the training data. The approach employs an ensemble of classifiers that serve as a filter for the training data. Using an n-fold cross validation, the training data is passed through the filter...

Online Kernel Selection: Algorithms and Evaluations

Kernel methods have been successfully applied to many machine learning problems. Nevertheless, since the performance of kernel methods depends heavily on the type of kernels being used, identifying good kernels among a set of given kernels is important to the success of kernel methods. A straightforward approach to address this problem is cross-validation by training a separate classifier for e...

Identifying the Mislabeled Training Samples of ECG Signals using Machine Learning

The classification accuracy of electrocardiogram signals is often affected by diverse factors, among which mislabeled training samples are one of the most influential. To mitigate this negative effect, cross-validation is introduced to identify the mislabeled samples. The method utilizes the cooperative advantages of different classifiers to act as a filter for t...

HLearn: A Machine Learning Library for Haskell

HLearn is a Haskell-based library for machine learning. Its distinguishing feature is that it exploits the algebraic properties of learning models. Every model in the library is an instance of the HomTrainer type class, which ensures that the batch trainer is a monoid homomorphism. This is a restrictive condition that not all learning models satisfy; however, it is useful for two reasons. First...

Publication date: 2013